REPORT  DOCUMENTATION  PAGE 


Form  Approved 
0MB  No.  0704-0188 


The  public  reporting  buiden  for  this  collection  of  information  is  estimated  to  everege  t  hour  per  lesponse,  including  the  tim 
gatheiing  and  maintaining  the  date  needed,  and  completing  and  reviewing  the  collection  of  information,  Send  comments  regardi 
information,  including  suggestions  for  reducing  the  burden,  to  the  Department  of  Defense,  Executive  Services  and  Communic 
that  notwithstanding  any  othei  provision  of  law,  no  person  shall  be  subject  to  any  penalty  foi  failing  to  comply  with  a  collec 
control  numbei. 

PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ORGANIZATION. 

le  for  reviewing  instructions,  searching  existing  data  sources, 
ng  this  burden  estimate  or  eny  other  aspect  of  this  collection  of 
etions  Directorete  {0704-01881.  Respondents  should  be  aware 
tion  of  information  if  it  does  not  display  a  cuirently  valid  0MB 

1.  REPORT  DATE  (DD-MM-YYYY}  2.  REPORT  TYPE 

08-10-2009  Journal  Article 

3.  DATES  COVERED  (From  -  To) 

4.  TITLE  AND  SUBTITLE 

Super-Ensemble  Techniques:  Application  to  Surface  Drift  Prediction  During 
the  DART06  and  MREA07  Campaigns 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

060153N 

6.  AUTHOR(S) 

L.  Vandcnbulcke,  J.  Beckers,  F.  Lenartz,  A.  Barth,  P.  Poulain,  M.  Aidonidis, 

J.  Meyrat,  F.  Ardhuin,  M.  Tonani,  C.  Fratianni,  L.  Torrisi,  D.  Pallela,  J. 
Chiggiato,  M.  Tudor,  J.  Book,  P.  Martin,  G.  Peggion,  M.  Rixen 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

73-9858-A8-5 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS{ES) 

Naval  Research  Laboratory 

Oceanography  Division 

Stennis  Space  Center,  MS  39529-5004 

8.  PERFORMING  ORGANIZATION 

REPORT  NUMBER 

NRL/JA/7330-09-9070 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESSIES) 

Office  of  Naval  Research 

800  N.  Quincy  St. 

Arlington,  VA  22217-5660 

10.  SPONSOR/MONITOR  S  ACRONYM(S} 

ONR 

11.  SPONSOR/MONITOR'S  REPORT 
NUMBERIS) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release,  distribution  is  unlimited. 


13.  SUPPLEMENTARY  NOTES 


20091 021 255 


14.  ABSTRACT 

The  prediction  of  surface  drift  of  floating  objects  is  an  important  task,  w  ith  applications  such  as  marine  transport,  pollutant  dispersion,  and  s e arc h-and* rescue 
activities.  But  forecasting  even  the  drift  of  surface  waters  is  ver>  challenging,  because  it  depends  on  complex  interactions  of  currents  driven  by  the  wind,  the  wave 
Held  and  the  general  prevailing  circulation.  Furthermore,  although  each  of  those  can  be  forecasted  by  deterministic  models,  the  latter  all  suffer  from  limitations, 
resulting  in  imperfect  predictions.  In  the  present  study,  we  try  and  predict  the  drift  of  two  huoys  launched  during  the  DART06  (Dynamics  of  the  Adriatic  sea  in 
Real-Time  2006)  and  MRFA07  (Maritime  Rapid  Environmental  Assessment  2007)  sea  trials,  using  the  so-called  hyper-ensemhle  technique:  different  models  are 
combined  in  order  to  minimize  departure  from  independent  observations  during  a  training  period;  the  obtained  combination  is  then  used  in  forecasting  mode.  We 
review  and  try  out  different  hyper-ensemble  techniques,  such  as  the  simple  ensemble  mean,  least-squares  weighted  linear  combinations,  and  techniques  based  on  data 
assimilation,  which  dynamically  update  the  nioders  weights  in  the  combination  when  new  observations  become  available.  We  show-  that  the  latter  methods  alleviate 
the  need  of  fixing  the  training  length  a  priori,  as  older  information  is  automatically  discarded 


15.  SUBJECT  TERMS 

super-ensemble,  multi-model,  surface  drift 


16.  SECURITY  CLASSIFICATION  OF: 

17.  LIMITATION  OF 
ABSTRACT 

18.  NUMBER 

19a.  NAME  OF  RESPONSIBLE  PERSON 

a.  REPORT 

b.  ABSTRACT 

c.  THIS  PAGE 

OF 

PAGES 

19 

Jeffrey  Book 

Unclassified 

Unclassified 

Unclassified 

UL 

19b.  TELEPHONE  NUMBER  (hdude  area  code) 

228-688-5251 

Standard  Form  298  (Rev.  8/98) 

Prescribed  by  ANSI  Sid.  Z39.18 


Progress  in  Oceanography  82  (2009)  149-167 


Contents  lists  available  at  ScienceDirect 

J' Progress  in  Oceanography 

I  I  SI  \  I  r  R  journal  homepage:  www.elsevier.com/Iocate/pocean 


Super-ensemble  techniques:  Application  to  surface  drift  prediction 

L  Vandenbulcke^*,  J.-M.  Beckers^,  F.  Lenartz^,  A.  Barth ^  P.-M.  Poulain*’,  M.  Aidonidis*^,  J.  MeyratS 
F.  Ardhuin*^,  M.  Tonani**,  C.  Fratianni'*,  L  Torrisi^  D.  Pallela^J.  Chiggiato^^,  M.  Tudor®,  J.W.  Book'’, 
P.  Martin'’,  G.  Peggion',  M.  Rixen-* 

*  CeoHydrodynamics  ond  Environment  Research,  C/niversif#  de  U^e.  Bat.  B5.  17.  Ait^e  du  6  AoQt,  4000  Li^ge.  Belgium 

^Istituta  Naziavale  di  Oceanagrafia  e  di  Geafisica  sperimentale  (OGS),  Trieste.  Italy 

^Service  Hydrographique  et  Oc^nagraphique  de  la  Marine.  13  rue  du  ChofeWier,  29200  Brest,  France 

‘^/sfifufo  Nazionole  di  Geafisica  e  Vu/cono/ogia  Balagna.  Italy 

^Palazza  AM..  Wo/e  deH’Lfniversito’  4.  000185  Rama.  Italy 

^ARPA  Emilia  Komogna  Servizio  Idro  Metearolagica.  Wo/e  Si/voni  6.  40122  Balagna.  Italy 
*DHMZ  Metearolagical  and  Hydrolagical  Service,  Zagreb,  Croatia 
US  Naval  Research  Labaratary,  Stennis  Space  Center.  MS  39529,  United  States 
‘University  of  New  Orleans,  2000  Lakeshore  Dr.,  New  Orleans.  LA  70148,  United  Stotes 
^  NATO/SACLANT  Undersea  Research  Centre.  La  Spezia.  Italy 


ARTICLE  INFO 


ABSTRACT 


Article  history; 

Received  22  January  2009 
Received  in  revised  form  3  June  2009 
Accepted  8  June  2009 
Available  online  13  June  2009 


Keywords; 
Supier-ensemble 
Multi-model 
Surface  drift 


The  prediction  of  surface  drift  of  floating  objects  is  an  important  task,  with  applications  such  as  marine 
transport,  pollutant  dispersion,  and  search-and-rescue  activities.  But  forecasting  even  the  drift  of  surface 
waters  is  very  challenging,  because  it  depends  on  complex  interactions  of  currents  driven  by  the  wind, 
the  wave  field  and  the  general  prevailing  circulation.  Furthermore,  although  each  of  those  can  be  fore¬ 
casted  by  deterministic  models,  the  latter  all  suffer  from  limitations,  resulting  in  imperfect  predictions. 
In  the  present  study,  we  try  and  predict  the  drift  of  two  buoys  launched  during  the  DART06  (Dynamics 
of  the  Adriatic  sea  in  Real-Time  2006)  and  MREA07  (Maritime  Rapid  Environmental  Assessment  2007) 
sea  trials,  using  the  so-called  hypier-ensemble  technique:  different  models  are  combined  in  order  to  min¬ 
imize  departure  from  independent  observations  during  a  training  period;  the  obtained  combination  is 
then  used  in  forecasting  mode.  We  review  and  try  out  different  hyper-ensemble  techniques,  such  as 
the  simple  ensemble  mean,  least-squares  weighted  linear  combinations,  and  techniques  based  on  data 
assimilation,  which  dynamically  upKlate  the  model’s  weights  in  the  combination  when  new  observations 
become  available.  We  show  that  the  latter  methods  alleviate  the  need  of  fixing  the  training  length  a  priori, 
as  older  information  is  automatically  discarded. 

When  the  forecast  period  is  relatively  short  (12  h),  the  discussed  methods  lead  to  much  smaller  fore¬ 
casting  errors  compared  with  individual  models  (at  least  three  times  smaller),  with  the  dynamic  methods 
leading  to  the  best  results.  When  many  models  are  available,  errors  can  be  further  reduced  by  removing 
colinearities  between  them  by  performing  a  principal  component  analysis.  At  the  same  time,  this  reduces 
the  amount  of  weights  to  be  determined. 

In  complex  environments  when  meso-  and  smaller  scale  eddy  activity  is  strong,  such  as  the  Ligurian 
Sea.  the  skill  of  individual  models  may  vary  over  time  periods  smaller  than  the  forecasting  period  (e.g. 
when  the  latter  is  36  h).  In  these  cases,  a  simpler  method  such  as  a  fixed  linear  combination  or  a  simple 
ensemble  mean  may  lead  to  the  smallest  forecast  errors.  In  environments  where  surface  currents  have 
strong  mean-kinetic  energies  (e.g.  the  Western  Adriatic  Current),  dynamic  methods  can  be  particularly 
successful  in  predicting  the  drift  of  surface  waters.  In  any  case,  the  dynamic  hyper-ensemble  methods 
allow  to  estimate  a  characteristic  time  during  which  the  model  weights  are  more  or  less  stable,  which 
allows  predicting  how  long  the  obtained  combination  will  be  valid  in  forecasting  mode,  and  hence  to 
choose  which  hyp>er-ensemble  method  one  should  use. 
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1.  Introduction 

The  prediction  of  the  drift  of  objects  floating  at  the  surface  of  the 
ocean  has  various  applications,  for  example  tracking  of  floating 
mines  or  pollutants  such  as  tar  balls,  dispersion  of  algae  blooms. 
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marine  transport,  search-and-rescue  activities,  etc.  However,  due  to 
multiple  reasons  whose  effects  add  up,  drift  prediction  remains  a 
very  challenging  task.  Even  small  errors  in  estimation  can  drastically 
change  the  subsequent  particle  trajectories  (Griffa  et  al.,  2004).  Even 
when  one  predicts  the  drift  of  buoys  configured  to  closely  track  the 
drift  of  surface  waters,  and  hence  only  the  ocean  current  should  be 
taken  into  account,  it  is  still  useful  to  also  take  surface  wind,  waves, 
tides,  etc.  into  consideration.  Indeed,  most  ocean  current  models  do 
not  include  wave-driven  currents  at  all,  and  wind-driven  currents 
are  not  fully  accurate.  However,  all  these  currents  contribute  (in  a 
complex  way)  to  the  real  surface  water  drift,  and  furthermore  inter¬ 
act  with  one  another.  Thus  missing  dynamics  in  ocean  current  mod¬ 
els  can  be  partially  accounted  for  using  empirical  methods  with 
direct  model  predictions  of  the  forcing  fields  (winds  and  waves) 
for  these  dynamics.  Finally,  when  one  tries  to  predict  the  drift  of  var¬ 
ious  floating  objects,  other  parameters  should  be  considered,  such  as 
the  specific  hydrodynamic  drifter  response. 

Even  though  most  of  these  contributions  can  be  forecast  by 
deterministic  models  (albeit  with  some  limitations  inherent  to 
the  models),  there  is  not  yet  a  deterministic  method  to  combine 
them  in  order  to  reproduce  the  floating  object  drift;  completely- 
coupled  deterministic  models  that  take  all  these  processes  into 
consideration  are  just  now  under  development.  In  the  present 
study,  we  instead  use  multi-model  methods  to  try  and  empirically 
combine  individual  models  of  different  processes  that  are  all  di¬ 
rectly  or  indirectly  related  to  surface  drift.  Super-ensembles  (SE), 
which  combine  different  models  of  the  same  physical  processes, 
were  applied  within  the  atmospheric  community  by  Krishnamurti 
et  al.  (1999)  some  years  before  the  oceanic  community  took  on. 
Other  atmospheric  studies  followed,  see  e.g.  Shin  and  Krishnamurti 
(2003a, b);  Williford  et  al.  (2003);  Yun  et  al.  (2003,  2005);  Mutemi 
et  al.  (2007).  Nowadays,  other  communities  also  apply  the  tech¬ 
nique  (e.g.  oceanography,  hydrology,  paleoclimatology,  etc.),  as 
they  all  realize  its  low  cost,  but  large  benefit.  Generally  speaking, 
the  technique  could  be  applied  to  every  field  where  different  con¬ 
current  models  aim  at  predicting  the  same  variable,  or  even  where 
different  models  predict  different  variables  which  are  all  somehow 
related  to  the  desired  output  variable.  In  the  latter  case,  the  tech¬ 
nique  is  rather  called  hyper-ensemble  (HE);  it  was  first  introduced 
in  the  oceanic  community  (Rixen  and  Ferreira-Coelho,  2007). 

In  the  present  study,  we  forecast  surface  drift  using  linear  HE 
methods  both  with  static  and  dynamic  weights,  the  latter  allowing 
the  weights  to  evolve  smoothly  in  time.  Section  2  is  devoted  to  the 
description  of  the  models  and  observational  data  used  in  two 
experiments:  the  DART06  sea  trial  in  the  Adriatic  Sea,  and  the 
MREA07  campaign  in  the  Ligurian  Sea.  The  HE  methods  are  de¬ 
scribed  in  Section  3.  We  will  then  focus  on  two  case  studies,  one 
where  the  drifter  is  predominately  influenced  by  a  mean-kinetic- 
energy  environment  (the  Western  Adriatic  Current)  and  one  where 
the  drifter  is  predominately  influenced  by  an  eddy-kinetic-energy 
environment  (Ligurian  Sea).  The  results  are  then  shown  in  Section 
4  and  a  summary  and  the  conclusions  are  given  in  Section  5. 


2.  Models  and  data 

Surface  drift  of  floating  objects  depends  on  various  factors.  It  is 
strongly  determined  by  the  ocean  surface  currents.  However,  the 
hydrodynamic  models  used  to  forecast  the  currents  have  chaotic 
components,  have  incomplete  representations  of  the  underlying 
physics,  and  have  uncertainties  on  forcing  fields  and  model  param¬ 
eters.  For  a  complete  discussion  of  error  causes  in  hydrodynamic 
models,  see  e.g.  Lermusiaux  et  al.  (2006).  The  hydrodynamic  mod¬ 
els  used  in  both  experiments  have  high  resolutions  (between  1/16° 
and  1/100°),  and  therefore  represent  many  smaller  scale  processes 
that  are  difficult  to  correctly  phase  and  forecast.  The  fact  that  the 


models  have  energies  at  such  scales  is  ultimately  important  for 
successful  HE  modeling,  but  phase  problems  can  easily  lead  to 
higher  forecast  errors  than  for  lower  resolution  models  (no  energy 
at  these  scales)  if  the  higher  resolution  models  are  not  corrected  in 
some  way.  On  top  of  this,  even  with  this  high  resolution,  many 
phenomena  at  yet  smaller  scales  are  not  represented,  whereas 
the  real  surface  drift  depends  on  every  scale  present. 

Paldor  et  al.  (2004)  shows  that  instantaneous  winds  have  more 
influence  on  surface  drift  than  climatic  surface  currents;  Rixen  and 
Ferreira-Coelho  (2007)  confirm  this  by  showing  that  in  an  atmo¬ 
spheric-oceanic  hyper-ensemble,  the  (weighted)  wind  model  has 
more  importance;  ocean  advection  has  less  impact.  However,  the 
wind-driven  surface  current  is  still  poorly  understood.  Observa¬ 
tions  show,  in  addition  to  inertial  oscillations,  a  drift  of  the  order 
of  2-4%  of  the  wind  speed  with  directions  that  vary  from  0°  to 
30°  to  the  right  of  the  wind  in  the  Northern  Hemisphere,  and  to 
the  left  in  the  Southern  Hemisphere  (Tsahalis,  1979).  These  varia¬ 
tions  may  be  understood  as  the  combination  of  a  wave-induced 
Stokes  drift,  roughly  aligned  with  the  wind,  and  a  drift  due  to  the 
wind-driven  current.  The  magnitude  and  deflection  angle  of  this 
current  depend  strongly  on  the  vertical  structure  of  turbulence. 
For  example,  the  classical  Ekman  (1905)  theory  with  a  constant 
eddy  viscosity  give  a  45°  deflection  angle,  while  linear  eddy  viscos¬ 
ity  profiles  give  deflections  of  the  order  of  10°  (Madsen,  1977).  Re¬ 
cent  evidence  for  strong  mixing  in  the  upper  ocean  [e.g.  (Agarwal 
et  al.,  1992)]  suggest  that  the  eddy  viscosity  profile  may  be  piece- 
wise-linear  with  a  strong  surface  value.  This  should  produce  a  sur¬ 
face  current  limited  to  about  0.5%  of  the  wind  speed  in  open  ocean 
conditions  without  stratification,  and  about  1%  with  a  strong  strat¬ 
ification.  Given  that  the  surface  Stokes  drift  (see  below)  is  of  the  or¬ 
der  of  1.2%  of  the  wind  speed,  the  total  surface  drift  explained  by 
models  with  realistic  mixing  is  of  the  order  of  2%  of  the  wind  speed 
(Rascle  et  al.,  2006;  Rascle  and  Ardhuin,  2009).  This  is  generally  on 
the  low  side  of  the  reported  values  for  surface  drift.  This  difference 
may  be  due  to  fetch  variations  (e.g.  laboratory  compared  to  field 
conditions),  convergence-related  biases  (such  as  caused  by  Lang¬ 
muir  circulations)  or  yet  unknown  processes.  As  a  "rule-of-thumb", 
we  will  consider  that  the  wind  sets  up  a  surface  current  of  roughly 
3%  of  the  wind  speed,  15°  to  the  right  of  the  downwind  direction. 
But  similarly  to  the  ocean  models  mentioned  before,  the  atmo¬ 
spheric  models  used  to  forecast  the  wind  field  suffer  of  their  own 
limitations:  they  are  also  chaotic,  also  have  only  an  incomplete 
representation  of  the  real  atmospheric  physics,  etc. 

The  wave  theory  leads  to  the  so-called  Stokes  drift,  which  in¬ 
duces  a  movement  of  water  particles  in  the  direction  of  the  waves. 
The  displacement  velocity  depends  on  the  ratio  of  wave  height  and 
wavelength;  it  also  strongly  decreases  with  depth  and  becomes 
negligible  at  a  depth  equal  to  a  fourth  of  the  wavelength.  The  Cori¬ 
olis  force  induces  yet  another  net  transport,  the  so-called  Hassel- 
mann  drift,  which  depends  on  the  turbulence,  and  has  a  direction 
opposed  to  the  Stokes  drift.  The  sum  of  vertically-integrated  net 
transports  of  the  Stokes  and  Hasselmann  drifts  is  zero.  leading  to 
a  zero  net  water  transport.  However,  the  different  vertical  profiles 
for  Stokes  and  Hasselmann  drifts  indicate  that  the  former  is  more 
important  than  the  latter  at  the  surface,  leading  to  a  net  surface 
transport  in  the  direction  of  the  waves  (below  the  surface,  there 
is  a  transport  in  the  opposed  direction). 

Finally,  surface  drift  still  depends  on  other  phenomena  such  as 
tides. 

Most  of  the  drifters  used  in  the  DART06  and  MREA07  experi¬ 
ments  were  CODE  drifters  manufactured  by  Technocean  (model 
Argodrifter).  CODE  designs  were  developed  by  Davis  (1985)  to 
measure  the  currents  in  the  first  meter  under  the  sea  surface.  More 
details  about  these  drifters  can  be  found  in  Poulain  (1999)  and 
Ursella  et  al.  (2006).  Measurements  with  dye  (D.  Olsen,  Personal 
Communication)  and  through  direct  measurements  of  relative  flow 
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(P.-M.  Poulain,  Personal  Communication)  revealed  that  the  CODE 
drifters  follow  the  surface  currents  to  within  2-3  cm/s.  The 
wind-driven  components  of  the  CODE  drifter  velocities,  including 
Ekman  currents  and  slip,  were  recently  assessed  by  Poulain  et  al. 
(2009)  and  related  statistically  to  ECMWF  winds.  Using  complex 
linear  regression  models,  they  found  that  the  wind-driven  currents 
amount  to  1%  of  wind  speed  and  are  rotated  by  28^*  to  the  right  of 
the  wind. 

The  majority  of  the  drifters  were  localized  by  Glob  al  Position¬ 
ing  System  (GPS)  at  hourly  intervals.  Their  data  were  telemetered 
via  the  Argos  system  orbiting  on  the  NOAA  satellites.  The  drifter 
positions  were  edited  for  outliers  using  automatic  statistical  and 
manual  procedures  ( Barba nti  et  al.,  2007;  Ursella  et  al.,  2006). 

Finally,  lets  note  that  the  HE  methods,  inclusive  the  “tricks” 
(discussed  in  Section  3),  might  actually  also  account  for  the  slip 
and  leeway  response  of  the  particular  drifters  considered. 

2.7.  DART06  experiment 

We  first  try  and  predict  the  displacement  of  drifters  launched  in 
the  Adriatic  Sea  during  the  DART06  sea  trials;  drift  data  for  the  re¬ 
gion  were  compiled  by  Veneziani  et  al.  (2007).  During  this  cam¬ 
paign,  extensive  data  sets  were  collected  by  multiple  means,  and 
made  available  in  near  real-time.  Drifters  were  launched  and  data 
was  made  available  in  near  real-time  by  Istituto  Nazionale  di 
Oceanografia  e  di  Geofisica  sperimentale  (OGS)  and  the  NATO/SAC- 
LANT  Undersea  Research  Centre  (NURC).  Model  predictions  of  the 


Gargano  region  (41°45  N,  16°E)  were  used  to  direct  the  launching 
of  pairs  of  drifters  with  the  goal  of  maximizing  the  coverage  of 
the  sampling  area.  Some  drifters  were  found  to  separate  at  loca¬ 
tions  and  in  the  directions  given  by  the  model  finite-size  Lyapunov 
exponents  (FSLE)  (Haza  et  al.,  2007).  The  trajectories  are  shown  in 
Fig.  1;  we  will  focus  only  on  drifter  a06956  (Barbanti  et  al.,  2007) 
flowing  around  the  Gargano  peninsula  as  it  exhibits  a  typical 
behavior.  We  consider  only  the  first  week  of  the  drifter  trajectory, 
as  afterward  at  least  one  model  does  not  cover  the  area  anymore. 

At  the  same  time,  a  wide  range  of  atmospheric,  ocean  and  wave 
models  were  provided  operationally.  However,  increasing  the  com¬ 
plexity  of  the  problem  could  lead  to  less  accurate  results  if  over-fit- 
ting  occurs  (Everitt,  2002),  and  hence  only  two  wind  models  and 
two  hydrodynamic  models  are  used  in  the  HE  combinations  (i.e. 
no  wave  models  are  used).  The  following  models  were  used  in 
the  present  study: 

1.  Meteo  France  Aladin,  output  fields  provided  by  the  Service 
Hydrographique  et  Oceanographique  de  la  Marine  (SHOM), 
http://www.cnrm.meteo.fr/aladin.  The  horizontal  resolution  is 
0.1°;  hourly  model  outputs  are  available.  This  model  is  further 
referred  to  as  A/adin-FR.  The  predicted  drift  is  obtained  from 
the  following  rule-of-thumb:  the  time  interval  multiplied  by 
3%  of  the  wind  speed,  with  a  direction  15°  to  the  right. 

2.  Aladin /Croatia,  run  by  the  Meteorological  and  Hydrological  Ser¬ 
vice  of  Croatia  (see  Ivatek-Sahdan  and  Tudor  (2004)  and  http:// 
meteo.hr/index_en.php).  The  horizontal  resolution  is  0.03°  and 


Fig.  1.  Trajectories  of  the  drifters  launched  during  DART06.  The  dark  track  corresponds  to  drifter  a06956  studied  later  in  this  paper,  and  called  “track  1"  further  on;  the  first 
week  of  data,  which  Is  effectively  used  in  this  study.  Is  in  red.  All  other  tracks  are  gray.  (For  interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is  referred 
to  the  web  version  of  this  article.) 
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the  “time  resolution”  (i.e.  when  the  model  outputs  are  saved  to 
disk)  is  3  h.  This  model  is  further  referred  to  as  Aladin-HR.  The 
predicted  drift  is  again  obtained  by  the  same  rule-of-thumb. 

3.  AdriaROMS,  an  operational  ocean  forecasting  system  for  the 
Adriatic  Sea  run  by  the  HydroMeteorological  Service  of  ARPA 
Emilia  Romagna,  Bologna,  Italy  (see  e.g.  Chiggiato  and  Oddo 
(2008)  and  references  herein,  and  http://www.arpa.emr.it/ 
sim/?mare),  further  referred  to  as  ROMS.  The  resolution  is 
0.025°  and  3  h. 

4.  NRL  (Navy  Research  Laboratory)  regional  Navy  Coastal  Ocean 
Model.  NCOM  was  implemented  over  the  Adriatic  sea  (Martin 
et  al.,  2009),  and  subsamples  were  made  available  in  near 
real-time;  here  we  use  the  area2  subset  covering  the  central 
Adriatic  Sea  only,  with  a  horizontal  resolution  of  0.08°  and  time 
resolution  of  3  h.  It  is  further  referred  to  as  NCOM_D06. 

The  reader  is  referred  to  the  official  documentation  of  the  rele¬ 
vant  operational  centers  or  above  cited  journal  papers  for  descrip¬ 
tions  of  the  models.  All  in  all,  with  the  constant  (bias)  model  added, 
there  are  5  weights  to  determine  in  order  to  obtain  a  linear  HE 
(which  may  be  real  or  complex  numbers  depending  on  the  method 
used),  or  less  if  principal  component  analysis  (PCA,  see  Section  3.3) 
is  applied  beforehand. 

2.2.  MREA07  experiment 

We  also  try  out  the  hyper-ensemble  techniques  with  data  from 
the  MREA07  experiment  in  the  Ligurian  Sea.  This  campaign  also 
aimed  at  collecting  a  vast  amount  of  observations,  and  drifters  data 


were  again  provided  by  NURC  and  OGS.  The  trajectories  are  shown 
in  Fig.  2  [see  (Zanasca  et  al.,  2007)).  We  focus  only  on  the  entire 
track  a74875  later  in  this  study. 

At  the  same  time,  multiple  models  were  applied  to  the  domain. 
We  again  use  two  atmospheric  models  and  two  hydrodynamic 
models  in  our  ensemble.  In  order  to  add  some  complexity,  we  will 
also  include  a  Stokes  drift  model,  even  though  remembering  that  it 
might  be  correlated  to  the  wind  contribution.  Furthermore,  ob¬ 
served  drifter  trajectories  (see  Fig.  2)  indicate  that  the  inertial 
oscillations  are  quite  important.  Hence,  we  also  add  a  synthetic 
model  corresponding  to  a  circular  trajectory.  This  was  not  neces¬ 
sary  in  the  case  of  the  DART06  experiment,  where  the  considered 
drifter  is  mainly  constrained  by  the  relatively  strong  Western  Adri¬ 
atic  Current  (WAC),  leaving  little  contribution  to  inertial  oscilla¬ 
tions.  In  the  Ligurian  Sea,  the  inertial  period  is  about  17.9h.  Of 
course,  this  synthetic  model  by  itself  will  not  be  able  to  represent 
real  drifter  trajectories,  because  it  lacks  the  correct  amplitude  and 
phase.  However,  when  this  is  corrected  for  during  the  training  per¬ 
iod,  and  a  bias  model  is  also  considered,  the  obtained  synthetic 
forecast  may  correspond  surprisingly  well  to  reality,  particularly 
if  other  currents,  winds,  etc.  are  weak.  In  an  ensemble  of  models, 
the  synthetic  model  may  compensate  incorrect  (e.g.  dephased) 
inertial  oscillations  of  some  models. 

All  in  all,  the  following  models  were  used; 

1.  Meteo  France  Aladin  (provided  by  SHOM).  This  model  is  further 
referred  to  as  A/adin-FR;  predicted  drift  is  obtained  from  the 
same  rule-of-thumb  as  before.  Horizontal  and  time  resolution 
are  0.1°  and  1  h,  respectively 
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Fig.  2.  Trajectories  of  the  drifters  launched  during  MREA07.  The  dark  track  corresponds  to  drifter  a74875  (Zanasca  et  al..  2007)  also  studied  later  In  the  paper,  and  called 
“track  5”.  The  two  red  boxes  correspond  to  later  Fig.  1 4  (largest  box)  and  Fig.  16  (smallest  box).  (For  interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is 
referred  to  the  web  version  of  this  article.) 
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2.  COSMO-ME  (vsnAW.cosmo-modeI.org/content/tasks/operational/ 
default.htm)  run  operationally  by  CNMCA  -  Italian  Meteorologi¬ 
cal  Service  (http://vsww.meteoam.it),  further  referred  to  as 
COSMO-ME;  and  again  drift  is  obtained  from  the  rule-of-thumb. 
The  resolutions  are  0.03®  and  1  h. 

3.  Mediterranean  Forecasting  System  run  by  INGV,  Bologna,  Italy, 
see  Pinardi  et  al.  (2003)  and  (http://vsww.bo.ingv.it/mfs/)  for 
the  whole  forecasting  system,  and  Tonani  et  al.  (2008)  for  the 
model  itsef,  further  referred  to  as  MFS.  Resolutions  are 
0.0625®  and  1  day. 


4.  NRL  NCOM  (see  Coelho  et  al.  (in  press)),  further  referred  to  as 
NCOM_M07,  with  resolutions  0.005®  and  1  h. 

5.  WaveWatch  111  (SHOM),  further  referred  to  as  CMO  VWV3.  The 
resolution  is  0.1®  and  3  h.  The  predicted  drift  is  obtained  as 
the  time  interval  multiplied  by  the  velocity;  the  latter  is 
obtained  from  the  wave  model  as  3.2^,  where  Hj  is  the  signif¬ 
icant  wave  height  and  Tm  the  mean  period  of  a  broad  spectrum 
of  waves  (Camiel  et  al.,  2002). 

6.  a  synthetic  model  of  inertial  oscillations  with  a 
period  1 7.9  h. 
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Fig.  3.  DART06  experiment:  average  (over  all  segments)  final  (blue)  and  hourly  average  (red)  error  (kml  for  the  drifter  position  after  12  h.  using  various  HE  methods,  during 
the  last  12  h  of  the  training  period  (upper  panel)  and  during  the  forecast  (lower  panel).  The  results  are  averaged  over  all  12-h  segments  of  track  a06956.  (For  interpretation  of 
the  references  to  color  in  this  figure  legend,  the  reader  Is  referred  to  the  web  version  of  this  article.) 
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Thus,  considering  a  bias  model,  at  most  seven  (real  or  complex) 
weights  are  to  be  determined  with  the  HE  methods. 

3.  Hyper-cnscmble  methods 

Super-ensembles  and  hyper-ensembles  are  techniques  which 
aim  at  combining  multiple  models  (of  respectively  the  same  and 
different  physical  processes)  in  order  to  provide  a  forecast  with  a 
higher  skill.  The  optimal  combination  is  obtained  during  a  training 
period,  and  minimizes  the  distance  to  independent  observations. 
Thus.  SE  techniques  can  be  considered  as  data  assimilation  meth¬ 
ods,  as  they  aim  to  optimally  combine  different  sources  of  informa¬ 
tion  (in  this  case,  multiple  models,  and  observations).  The  main 
question  for  these  techniques  is  whether  the  obtained  combination 
will  still  be  optimal  in  the  forecasting  mode,  i.e.  one  needs  to  know 
a  characteristic  time  during  which  the  combination  is  stable,  which 
means,  a  characteristic  time  during  which  none  of  the  model’s  skill 
significantly  changes.  Krishnamurti  et  al.  (1999)  proposed  to  use  an 
unbiased  linear  combination  of  the  available  models,  optimal  (in 
the  least-squares  sense)  with  respect  to  observations  during  a 
training  period  of  a  priori  chosen  length;  all  observations  have  equal 
importance.  Rixen  and  Ferreira -Coe  1  ho  (2007)  applied  the  tech¬ 
nique  in  the  ocean,  also  adding  non-linear  combinations  of  the 
models  (i.e.  using  neural  networks  and  genetic  algorithms),  but 
found  little  improvement  over  the  linear  combination.  This  can  be 
understood  as  the  combination  is  determined  over  the  same  train¬ 
ing  period,  either  by  linear  or  non-linear  methods.  Thus,  not  much 
is  changed  with  respect  to  the  combination  being  (staying)  appro¬ 
priate  (or  not)  in  forecasting  mode.  However,  Shin  and  Krishnamur¬ 
ti  (2003a):  Rixen  et  al.  (in  press)  introduced  dynamically  evolving 
weights  in  a  linear  combination  of  models,  using  data  assimilation 
techniques  (Kalman  filter  and  particle  filter)  adapted  to  the  super¬ 
ensemble  paradigm.  The  latter  techniques  are  able  to  train  the 
weights  on  a  time-scale  corresponding  to  their  natural  characteris¬ 
tic  time,  discarding  older  information  automatically.  The  weight's 
rate  of  change  is  determined  by  the  respective  (and  evolving) 
uncertainties  of  the  weights  themselves,  of  individual  models  and 
of  observations.  Hence,  these  techniques  were  shown  to  yield  sig¬ 
nificantly  better  results  than  more  simple  techniques.  Of  course,  if 


one  desires  to  obtain  a  forecast  further  away  in  the  future  than  this 
characteristic  time,  no  optimal  combination  can  possibly  be  ob¬ 
tained,  and  without  other  a  priori  knowledge,  one  should  probably 
just  use  a  simple  ensemble  mean  of  the  model  forecasts. 

In  the  current  study,  we  try  to  forecast  the  motion  of  surface 
drifters.  Their  position  can  be  elegantly  represented  using  complex 
numbers,  the  longitude  being  the  real  part,  and  the  latitude  the 
imaginary  part.  The  used  HE  methods  are  described  hereunder  in 
the  context  of  our  application. 

3.1.  Individual  models 

The  simplest  SE  technique  is  called  "best  model";  it  simply  se¬ 
lects  the  model  which  performs  best  during  the  whole  training 
period,  and  uses  that  one  to  obtain  the  forecast,  discarding  all  other 
models.  Although  potentially  useful  information  is  neglected,  this 
method  is  often  used  in  practice. 

A  variant  on  this  method  is  to  multiply  each  model  by  a  com¬ 
plex  number  determined  during  the  training  period.  This  corre¬ 
sponds  to  stretching  and  rotating  the  drift  vector  predicted  by 
the  model.  When  considering  wind  models,  the  multiplication  thus 
allows  to  "optimize"  the  rule-of-thumb  mentioned  above  (surface 
drift  velocity  of  3%  of  the  wind  velocity,  IS"*  to  the  right). 

A  third  variant  also  removes  the  bias  by  searching  for  an  optimal 
combination  of  the  considered  model  and  a  synthetic,  constant 
model  (i.e.  bias);  both  models  are  also  multiplied  by  complex  factors. 

3.2.  Ensemble  mean 

The  next  method  is  the  simple  "ensemble  mean".  It  does  not  use 
a  training  period  or  observations  and  thus,  cannot  really  be  consid¬ 
ered  as  a  SE  technique;  however,  it  is  also  a  widely  used  method, 
since  long  known  to  provide  better  forecasts  than  individual  mod¬ 
els  (Kalnay  and  Ham.  1989). 

3.3.  Least-squares  linear  combinations 

Another  technique  consists  of  finding  a  linear  combination  of 
the  models,  minimizing  (in  the  least-squares  sense)  its  departure 
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Fig.  4.  Results  of  the  forecast  by  all  HE  methods  for  a  particular  12-h  segment  in  the  track  a06956  showed  In  Fig.  1,  with  the  training  period  starting  24  h  after  the  drifter’s 
deployment.  The  forecast  starts  at  the  brown  diamond:  the  pink  diamond  represents  the  real  drifter  position  at  the  end  of  the  forecast.  (For  interpretation  of  the  references  to 
color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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from  observations  during  the  training  period.  Again,  the  weights 
are  complex  numbers,  which  corresponds  to  stretching  and  rotat¬ 
ing  each  model  in  order  for  the  final  combination  to  be  optimal. 
Two  variants  of  this  method  are  also  used  in  our  study.  First,  we 
add  again  a  constant  model,  thus  adding  an  unbiasing  capability 
to  our  ensemble.  Second,  we  remove  some  of  the  colinearities  be¬ 
tween  the  models.  To  this  purpose,  we  perform  principal  compo¬ 
nent  analysis  (PCA)  on  the  models,  and  decide  to  remove  a 
certain  percentage  of  variability,  e.g.  10%.  For  example,  when  con¬ 


sidering  seven  models,  they  would  be  transformed  into  seven  prin¬ 
cipal  components,  of  which  the  last  2  ones  might  be  discarded.  This 
has  the  further  advantage  of  reducing  the  amount  of  weights  that 
need  to  be  determined  (see  below). 

3.4.  Non-linear  combinations 

Another  class  of  SE  methods  use  non-linear  combinations  of 
models,  e.g.  by  feeding  individual  models  as  input  to  a  neural 
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Fig,  5.  DART06  experiment:  average  final  (blue)  and  hourly  average  (red)  error  (km|  for  the  drifter  position  after  24  h,  using  various  HE  methods,  for  the  hindcast  (upper 
panel)  and  forecast  (lower  panel)  modes.  The  results  are  averaged  over  all  24-h  segments  of  the  track  described  before.  (For  interpretation  of  the  references  to  color  in  this 
figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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network  or  genetic  algorithm.  However,  as  mentioned  before, 
this  does  not  change  the  fundamental  fact  that  the  combination 
is  determined  to  be  optimal  during  a  defined  training  period, 
and  one  just  hopes  that  it  will  still  be  adapted  to  the  forecast 
period.  Even  though  the  non-linear  combination  might  be  better 
than  the  linear  one,  in  practice,  improved  results  in  forecasting 
mode  were  not  observed  (Rixen  and  Ferreira-Coelho,  2007).  This 
might  be  due  to  the  fact  that,  compared  to  the  linear  combina¬ 
tion  (where  one  weight  per  model  has  to  be  determined),  more 
parameters  must  be  determined  for  those  non-linear  methods. 


even  if  one  uses  e.g.  a  neural  network  with  a  relatively  simple 
architecture.  Even  with  linear  methods,  the  more  models  are  in¬ 
cluded  in  the  SE.  the  more  weights  need  to  be  determined,  and 
hence,  smaller  ensembles  may  lead  to  better  results  (for  an  illus¬ 
tration,  see  e.g.  Maeng-Ki  et  al.  (2004)).  Thus,  some  improve¬ 
ments  might  appear  with  non-linear  methods  if  one  has  a 
large  amount  of  observations  during  the  training  period  (and  if 
no  over-fitting  problems  appear).  However,  this  is  not  the  case 
in  our  study,  and  hence,  we  will  not  consider  non-linear  meth¬ 
ods  any  further. 
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Fig.  6.  DART06  experiment:  average  final  (blue)  and  hourly  average  (red)  error  (km]  for  the  drifter  position  after  36  h.  using  various  HE  methods,  for  the  hindcast  (upper 
panel)  and  forecast  (lower  panel)  modes.  (For  Interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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Fig.  7.  Results  of  the  forecast  by  selected  HE  methods  for  a  particular  36-h  segment  in  track  a06956.  Same  color  codes  as  Fig.  4.  (For  interpretation  of  the  references  to  color  in 
this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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Fig.  8.  Evolution  of  model  weights  with  the  ACEKF  filter  as  a  function  of  time  (hours)  from  the  start  of  the  training  period,  corresponding  to  the  first  3  days  of  track  1.  The 
complex  weights  are  represented  by  their  magnitude  and  the  angle  they  form  with  the  eastward  axis  (positive  clockwise). 
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3.5.  Dynamical  methods 

In  all  previous  methods  (except  the  simple  ensemble  mean, 
obviously),  the  length  of  the  training  period  had  to  be  chosen  a  pri¬ 
ori  and  all  observations  during  the  training  period  have  an  equal 
importance.  More  complicated  methods  can  be  thought  of,  e.g. 
where  the  observation’s  importance  decreases  exponentially  with 
time.  However,  it  would  be  more  useful  to  have  a  method  automat¬ 
ically  adapting  the  weights  to  skill  changes  in  models.  This  can  be 
approximated  with  common  data  assimilation  (DA)  techniques: 
starting  from  our  best  guess,  the  weights  are  adapted  during  the 
training  period,  when  observations  are  available,  up  to  present 
time.  Afterward,  the  weights  are  frozen  and  used  during  the  fore¬ 
casting  period.  All  DA  algorithms  could  be  implemented;  we  will  re¬ 
strict  ourselves  to  sequential  DA  and  the  Kalman  filter  (Kalman, 
1960).  As  one  might  easily  get  confused  by  the  unusual  content  of 
the  different  matrices  in  the  Kalman  filter  equations,  we  briefly 
write  them  down  and  explain  them  below: 


Forecast 

X^(t,)  =  MrX‘’(f._,)  (1) 

i^(f,)  =  M,,p“(t..,)M;;  f  (i  (2) 

Analysis 

K  =  l^(t,)H^lR  +  HP^(fi)H^l-'  (3) 

x"(tO  =  xf(U)  +  K[y»  -  Hx'(ti)l  (4) 

P"(f.)  =  -  KHP^(f.)  (5) 


X  is  the  state  vector,  which  contains  the  weights  attributed  to  the 
models  in  the  SE  combination;  its  error  covariance  matrix  is  P. 
Superscript  /  denotes  its  forecasted  state  after  prediction  steps; 
superscript  a  stands  for  analyzed  state  after  the  correction  steps 
using  observations.  We  have  no  a  priori  knowledge  about  the 
weight’s  evolution  in  time,  and  hence,  the  “model”  matrix  M  is  cho¬ 
sen  as  the  identity  matrix  at  all  times;  the  state  vector  prediction 
step  is  trivial.  Another  choice  would  have  been  to  include  an  expo¬ 
nential  decrease  of  the  model  weights  toward  fj,  (N  being  the 


amount  of  models),  or  even  more  complicated  relaxation  schemes. 
In  any  case,  as  weights  obviously  do  evolve  in  time,  the  chosen  con¬ 
stant  model  M  contains  errors;  they  are  represented  by  the  random 
vector  If,  and  have  a  covariance  matrix  Q,  Although  not  mathemat¬ 
ically  constrained,  intuitively,  one  expects  model’s  weights  to  sum 
approximately  to  1,  and  to  lie  somewhere  in  or  close  to  the  (0-1) 
range.  Hence,  we  estimated  a  reasonable  standard  deviation  of  the 
(model)  error  for  individual  weights  to  be  0.1;  the  non-diagonal  ele¬ 
ments  of  0.  are  put  to  zero.  Furthermore,  the  errors  affecting  the 
state  vector  of  weights  have  a  covariance  matrix  denoted  by  P; 
the  initial  standard  deviation  is  chosen  as  0.7  (as  we  expect  a  rela¬ 
tively  bad  initial  guess  of  weights),  and  again,  non-diagonal  ele¬ 
ments  in  Po  are  put  to  zero  (though  they  will  become  non-zero  in 
time).  The  choices  for  the  values  of  Q  and  P  were  validated  by 
cross-correlation.  Let’s  also  note  that  the  prediction  step  for  P  al¬ 
lows  it  to  increase  by  Q  at  each  timestep,  in  accordance  with  our 
intuition  that  the  errors  on  weights  increase  with  time. 

Observations  are  represented  by  the  vector  y;  in  our  case  they 
are  observed  surface  drifts.  The  observation  operator  H  linking 
the  state  vector  space  with  the  observation  space,  contains  the 
individual  model  forecasts  of  surface  drift  (whereas  usually,  when 
one  assimilates  e.g.  temperature  in  a  primitive  equation  model,  H 
is  just  an  interpolation  operator). 

The  observations’  error  covariance  matrix  is  denoted  R,  and 
contains  three  contributions:  instrumental  errors  on  the  observa¬ 
tions  themselves  (supposed  small  in  our  experiments),  representa- 
tivity  errors  due  to  the  fact  that  the  model  does  not  represent  all 
the  physical  processes  included  in  the  observations,  and  errors  in 
the  observation  operator  H.  Thus,  R  essentially  contains  the  (un¬ 
known)  errors  affecting  all  the  individual,  physical  models;  these 
errors  should  be  carefully  estimated  as  R  is  a  critical  parameter 
in  the  filter’s  functioning.  However,  this  is  a  very  difficult  task, 
requiring  also  more  information  than  simply  each  model’s  fore¬ 
cast:  the  errors  and  shortcomings  of  individual  models  are  pre¬ 
cisely  the  reason  why  we  use  an  HE  method  fori  Hence,  in  the 
present  study,  R  was  again  chosen  by  cross-correlation. 

In  oceanography,  usually,  the  state  vector  contains  hundreds  of 
thousands  of  points,  so  that  low-rank  approximations  of  the  Kal- 


trackl 


Fig.  9.  Same  as  Fig.  8  but  for  the  weights  with  the  ACEKF  filter  in  a  singleton  ensemble  comprising  only  the  Aladin  (SHOM)  wind  model. 
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man  filter  must  be  implemented,  such  as  the  SEEK  filter  (Pham 
et  al.,  1998),  the  Ensemble  Kalman  filter  (Evensen,  1994),  etc.  How¬ 
ever,  here,  the  state  vector  is  very  small,  and  hence  the  original, 
complete  Kalman  filter  can  be  implemented.  Thus,  apart  from  the 
hypothesis  of  a  linear  model  and  a  Gaussian  weight  distribution, 
no  further  assumptions  have  to  be  made.  Finally,  it  should  also 
be  noted  that  at  the  end  of  the  training  period,  the  resulting  weight 
vector,  obtained  with  the  Kalman  filter,  is  strictly  identical  to  the 


one  that  would  have  been  obtained  with  the  Kalman  smoother 
(the  same  observations  having  been  taken  into  account)  or  with 
the  4D-Var  filter  (see  e.g.  (Bennett,  1992)]. 

The  equations  written  above  are  valid  for  real  numbers,  and 
hence  we  use  them  with  real  weights  (i.e.  the  individual  models 
are  multiplied  with  a  real  number  before  being  summed  together). 
However,  to  use  complex  numbers  as  with  the  previous  SE  meth¬ 
ods,  the  equations  must  be  adapted  into  the  so-called  Augmented 


average  error  for  track  5 
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Fig.  10.  MREA07  experiment;  average  (over  all  segments)  final  (blue)  and  hourly  average  (red)  error  1km)  for  the  drifter  position  after  12  h.  using  various  HE  methods,  during 
the  hindcast  (upper  panel)  and  the  forecast  (lower  panel).  The  results  are  averaged  over  all  1 2-h  segments  of  the  considered  track.  (For  interpretation  of  the  references  to  color 
in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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Complex  Extended  Kalman  filter  (ACEKF)  (Goh  and  Mandic,  2007), 
where  all  the  initial  vectors  and  matrices,  as  well  as  the  model  ma¬ 
trix,  are  "augmented’*  in  the  following  way: 


with  the  asterisk  denoting  the  complex  conjugate.  Vectors  thus  be¬ 
come  matrices  of  double  length,  and  width  equal  to  2;  matrices 
have  double  length  and  width.  For  our  study,  all  initial  covariance 
matrices  are  chosen  identically  as  above,  but  are  then  augmented. 
During  the  hindcast  period,  the  state  vector  covariance  matrix 
P*“*  progressively  becomes  fully  filled,  with  non-zero  covariances 
between  the  real  and  imaginary  parts. 

Thus,  using  the  ACEKF,  we  have  a  tool  allowing  to  dynamically 
evolve  complex  weights  during  the  hindcast  period,  and  automat¬ 
ically  take  covariances  between  longitude  and  latitude  increments 
into  account.  Finally,  let’s  note  that  the  previously  mentioned 
"tricks”  (unbiasing,  reduction  via  PCA)  can  also  be  applied  for  the 
dynamical  methods;  our  initial  guess  for  the  state  vector  is  simply 
taken  as  the  result  of  the  corresponding  least-squares  linear  com¬ 
bination  method. 

Other  dynamical  methods  can  be  thought  of  For  example,  if  one 
supposes  that  the  weights  of  the  models  in  the  combination  do  not 
have  normal  probability  density  functions,  the  Kalman  filter  should 


not  be  used.  Particle  filters  (see  e.g.  Doucet  et  al.  (2000),  or  van 
Leeuwen  (2003)  for  an  implementation  in  oceanography)  alleviate 
this  hypothesis  of  gaussianity.  In  our  SE  paradigm,  one  particle  is 
one  specific  linear  combination  of  models.  The  cost  is  that  one 
has  to  use  a  relatively  large  ensemble  of  particles  in  order  to  ensure 
convergence.  As  in  our  experiment,  the  model  M  is  the  identity 
model,  this  is  not  necessarily  a  limitation;  however,  in  the  present 
study,  the  most  time-consuming  step  is  the  spatial  and  temporal 
interpolation  in  relatively  massive  (physical)  models  output  files. 
The  results  of  a  standard  Sequential  Importance  Resampling  (SIR) 
filter  were  similar  to  those  of  the  Kalman  filter  (see  Section  4), 
and  about  1000  particles  were  required  for  convergence,  leading 
to  much  longer  computing  times. 

4.  Results 

For  the  two  experiments,  drifter  observations  and  model  fore¬ 
cast  fields  are  interpolated  in  order  to  have  one  position  every 
hour.  Each  hour,  model  velocity  fields  are  also  interpolated  spa¬ 
tially  to  the  exact  drifter  location.  During  the  training  as  well  as 
the  forecast  period,  we  use  model  casts  with  at  least  24  h  forecast 
lead  time.  In  other  words,  we  do  not  use  a  model  hindcast  for  the 
training  period,  but  use  a  forecast  at  least  24  h  old.  This  ensures 
that  models’  skills  are  not  artificially  higher  during  the  training 
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Fig.  n.  Training  and  forecast  using  the  Kalman  filter.  Training  starts  at  the  blue  diamond:  hourly  displacements  predicted  by  each  Individual  model  are  represented  by  a 
colored  segment.  The  actual  forecast  starts  at  the  brown  diamond,  the  pink  diamond  represents  the  real  drifter  position  at  the  end  of  the  forecast.  (For  interpretation  of  the 
references  to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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tracks 


Fig.  12.  Time-evolution  of  the  absolute  value  and  angle  of  the  weights  obtained  with  the  Kalman  filter  method  shown  in  Fig.  1 1. 


due  to  the  fact  that  data  is  assimilated  during  hindcasts.  Our  train¬ 
ing  period  is  chosen  as  48  h  (keeping  in  mind  that  dynamical  meth¬ 
ods  can  discard  older  information).  Forecasts  are  obtained  for  three 
horizons:  12  h,  24  h  and  36  h. 

4.1.  DART06  expenmenf 

Fig.  3  shows  the  position  error  after  12  h  of  forecast  (blue  bars)’ 
and  the  hourly  mean  error  during  these  12  h  (red  bars),  for  each  of 
the  HE  methods,  averaged  over  the  first  week  (i.e.  five  daily  fore¬ 
casts)  of  the  drifter  track  starting  on  1 1  March  2006  (the  first  track 
in  Fig.  1 ).  when  it  flows  along  the  Gargano  peninsula.  This  track  is 
the  most  rectilinear  one  of  the  experiment,  but  this  does  not  neces¬ 
sarily  make  model  predictions  correspond  more  accurately  with 
observations.  Indeed,  at  the  end  of  the  first  week,  at  least  one  model 
predicted  that  the  drifter  would  hit  the  shore,  which  was  not  the 
case.  The  upper  panel  shows  the  results  in  a  hindcast  period  (i.e.  a 
non-independent  pseudo-forecast  obtained  during  the  last  12h  of 
the  training  period,  which  means  the  weights  should  be  particularly 
well  adapted);  the  lower  panel  shows  the  results  in  the  independent 
forecast.  These  results  are  typical  for  all  the  tracks  in  the  WAC.  After 
12  h,  all  individual  (wind  or  current)  models  have  errors  of  6.5- 
17  km,  and  of  course  perform  equally  well  during  hindcast  and  fore¬ 
cast  (on  average).  In  general,  multiplying  an  individual  model  by  a 
weight  (obtained  during  the  training)  improves  the  hindcast  slightly. 
The  absolute  value  of  the  weights  in  question  is  generally  comprised 
between  0.8  and  1.2;  the  angle  is  small  for  the  ocean  models  and 
sometimes  larger  for  the  wind  models. 

Adding  a  bias  model  improves  the  results  very  significantly, 
with  errors  dropping  to  less  than  1  km  and  2  km  in  hindcast  and 
forecast  mode  respectively.  This  can  be  understood  as  the  trajec¬ 
tory  is  very  linear,  and  hence  the  bias  model  takes  a  lot  of  the 


'  For  interpretation  of  color  in  Figs.  1  -7, 1 0. 1 1 , 1 3, 1 4  and  1 7,  the  reader  is  referred 
to  the  web  version  of  this  article. 


weight  (i.e.  we  are  using  persistence);  the  considered  model  func¬ 
tions  as  a  correction  to  the  bias  or  persistence  model.  In  summary, 
correcting  any  of  the  models  for  bias  and  multiplying  it  with  a 
weight,  yields  much  better  forecasts  than  the  common  "best  mod¬ 
el",  or,  for  that  matter,  "ensemble  mean"  strategies. 

Combining  all  the  models  improves  results  only  slightly  com¬ 
pared  to  unbiased,  weighted  individual  models;  and  adding  the 
PCA  "trick"  does  not  improve  the  forecast  skill  a  lot  either  in  this 
case,  albeit  that  the  latter  method  yields  the  smallest  forecast  error 
of  all  static  methods. 

When  real  weights  are  evolved  during  the  training  period  with 
a  real-number  Kalman  filter,  results  are  relatively  bad  (final  error 
about  8  km).  Indeed,  real  weights  only  allow  stretching  the  drifter 
displacement  vectors  predicted  by  the  model,  but  not  rotating 
them.  When  one  adds  PCA,  the  first  principal  components  are  ori¬ 
ented  toward  the  direction  with  largest  variations,  and  hence  the 
rotation  induced  by  complex  weights  is  less  critical;  results  are 
better,  comparable  to  those  of  the  linear  combination  with  com¬ 
plex  weights.  Finally,  when  one  updates  complex  weights  with 
the  ACEKF,  the  best  results  are  obtained,  and  the  predicted  drifter 
position  is  very  close  on  the  real  position  (error  smaller  than  1  km). 
In  this  case,  adding  PCA  does  not  bring  any  improvement:  the  only 
benefit  would  be  to  remove  redundant  information,  which  appears 
unnecessary  here. 

As  an  example.  Fig.  4  shows  the  results  of  the  forecast  by  all  HE 
methods  for  the  third  12-h  segment  in  the  track  discussed  above. 
The  real  drifter  trajectory  is  represented  in  blue,  with  hourly  data 
represented  by  a  dot.  The  training  stops  at  the  brown  diamond; 
12  h  later,  the  drifter  is  at  the  pink  diamond.  All  four  individual 
models  bring  the  drifter  too  much  southward;  but  the  unbiased, 
weighted,  and  particularly  the  dynamical  methods  can  cope  with 
this  and  correct  the  forecast. 

Figs.  5  and  6  show  the  results  for  24  h  and  36  h  forecasts  respec¬ 
tively.  for  the  same  drifter.  Results  and  comparisons  between  the 
different  HE  methods  are  qualitatively  the  same,  although  of 
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course  the  forecast  error  gets  larger  as  the  forecast  length  in¬ 
creases.  Even  more  than  for  a  12  h  hindcast,  the  36  h  hindcast 
now  almost  coincides  with  the  48  h  training  period,  and  thus  the 
linear  combination  is  yielding  very  good  results  during  this  hind¬ 
cast.  An  example  of  a  36  h  forecast  is  shown  in  Fig.  7.  It  can  be  seen 
that  none  of  the  individual  models  are  very  successful,  hence  the 
ensemble  mean  is  not  accurate  either.  Corrected  individual  models, 


not  shown  in  the  figure  for  clarity,  are  closer  to  the  real  drifter  than 
the  respective  uncorrected  models.  However,  the  ensemble  linear 
combination  is  even  closer,  particularly  when  adding  PCA.  The  real 
Kalman  Filter  is  unable  to  rotate  models,  hence  the  results  are  not 
perfect,  as  explained  higher.  Finally,  one  can  see  that  among  all  HE 
methods,  the  ACEKF  filters  (with  or  without  PCA)  forecast  the  drif¬ 
ter  position  most  accurately. 
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Fig.  13.  MREA07  experiment:  average  final  (blue)  and  hourly-average  (red)  error  lkm|  for  the  drifter  position  after  24  h.  for  both  the  hindcast  (upper  panel)  and  the  forecast 
(lower  panel).  (For  interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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Finally,  to  illustrate  the  concept  of  the  characteristic  time  dur¬ 
ing  which  a  HE  combination  remains  valid.  Fig.  8  shows  the  evolu¬ 
tion  of  the  complex  weights  during  the  first  3  days  of  the 


considered  track.  It  can  be  seen  that  the  weights  undergo  rapid 
changes  starting  at  hour  8;  at  least  one  model  probably  undergoes 
a  strong  change  in  skill  at  that  time.  This  is  verified  using  the  com- 
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Fig.  14.  Results  of  the  forecast  by  selected  HE  methods  for  a  particular  24-h  segment  in  the  track  a74875  (see  the  largest  red  box  In  Fig.  2).  The  training  starts  at  the  blue 
diamond,  the  forecast  at  the  brown  diamond;  the  pink  diamond  represents  the  real  drifter  position  at  the  end  of  the  forecast.  (For  interpretation  of  the  references  to  color  in 
this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 


tracks 


Fig.  15.  Time-evolution  of  the  absolute  value  and  angle  of  the  weights  obtained  with  the  ACEKF  method  with  PCA  (result  showed  in  Fig.  14  in  dashed  black).  (For 
interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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plex  Kalman  filter  but  just  on  single  models.  For  example,  the  ob¬ 
tained  weight  evolution  of  the  Aladin  (SHOM)  wind  model  is 
shown  in  Fig.  9;  it  can  indeed  be  seen  that  from  hour  8,  the  drift 
predicted  by  that  model  has  to  be  strongly  attenuated  (by  about 
20%),  the  adjustment  taking  about  10  h. 

From  Fig.  8,  it  can  be  seen  that  similar  rapid  changes  occur 
around  hours  20  and  25;  but  elsewhere,  and  particularly  after  hour 
25.  the  weights  are  modified  only  slowly.  Thus,  as  only  small 
changes  happen  after  hours  25  (except  the  continuing  adjustment), 
and  onward  to  hour  48  these  changes  become  even  smaller,  one 
can  suppose  that  the  models’  skills  are  relatively  constant  during 
these  23  h.  This  gives  us  some  confidence  to  use  HE  methods  for 
the  forecast,  rather  than  the  ensemble  mean.  In  particular,  for 
the  track  considered  in  Figs.  8  and  9.  the  characteristic  time  of 
HE  validity  is  at  least  24  h.  This  should  be  related  to  the  Lagrangian 
autocorrelation  time,  which  is  about  half  a  day  to  1  day  (Poulain 
and  Zambianchi,  2007;  Rubio  et  al.,  in  press). 

The  absolute  value  of  the  final  weights  obtained  at  hour  48  (the 
end  of  the  training  period)  are  about  0.4  for  NCOM_D06.  and  less 
for  the  three  other  models,  although  no  model  gets  a  negligible 
weight.  Furthermore,  the  bias  model  obtains  about  0.1,  i.e.  the 
same  weight  as  the  ROMS  and  ALADIN  (SHOM)  models.  The  ocean 
models  undergo  relatively  small  rotations,  whereas  the  atmo¬ 
spheric  wind  models  are  turned  by  about  90°. 


4.2.  MREA07  experiment 

The  results  in  the  Ligurian  basin  are  less  straightforward,  as 
could  already  be  expected  from  Fig.  2,  particularly  because  most 
of  the  trajectories  closely  follow  the  coastline;  hence,  an  error  in 
one  of  the  individual  models  could  lead  the  simulated  trajectory 
into  land. 

Fig.  10  shows  the  error  bars  for  ’’track  5"  (shown  in  Fig.  2),  con¬ 
cerning  the  12  h  forecast.  Conclusions  are,  again,  similar  to  those 
obtained  in  the  DART06  experiment.  In  particular,  the  best  results 
are  now  obtained  with  the  real-number  Kalman  filter  with  the  PCA 
trick.  All  HE  methods  yield  better  results  than  the  simple  ensemble 
mean,  except  the  ACEKF  (without  PCA).  In  general,  it  can  be  seen 
that  PCA  reduces  the  forecast  errors.  As  shown  later,  this  is  also 
the  case  of  the  24-h  and  36-h  forecasts.  Hence,  one  might  suspect 
that  some  models  present  colinearities  (which  need  to  be  re¬ 
moved)  or  that  there  are  simply  too  many  weights  (seven  complex 
numbers)  to  be  determined.  For  the  12-h  forecast,  when  comparing 
the  real  and  complex  Kalman  filters  respectively,  the  advantage  of 
having  less  degrees  of  freedom  to  determine  outbalances  the  fact 
that  drift  vectors  can  only  be  stretched,  and  not  rotated. 

An  example  of  result  obtained  with  the  Kalman  Filter  method  is 
detailed  in  Fig.  11;  the  time-evolution  of  the  weights  is  shown  in 
Fig.  12.  One  can  see  from  Fig.  1 1  that  none  of  the  individual  models 
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Fig.  16.  Results  of  the  forecast  by  selected  HE  methods  for  a  particular  24-h  segment  In  the  track  a74fi75  (see  the  smaller  red  box  in  Fig.  2).  Same  color  codes  as  Fig.  14.  The 
NCOM_M07  forecast  extends  to  44°!  5‘N,  9'^3  W  but  is  cut  off  for  clarity.  (For  interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web 
version  of  this  article.) 
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is  quite  accurate;  most  predicted  displacements  are  too  small  (ex¬ 
cept  for  NCOM_M07.  which  has  correct  amplitudes  but  is  badly 
orientated  most  of  the  time,  moreover  with  changing  error  direc¬ 
tion).  However,  the  weights  adapt  permanently  to  the  latest  infor¬ 
mation;  one  can  see  that  for  this  particular  segment,  the  SHOM 
(Aladin-France)  model  obtains  a  larger  weight;  furthermore,  the 
bias  also  becomes  more  and  more  important.  The  circular  models 
keep  low  weights  at  all  times,  but  as  the  weight  of  the  COSMO- 
ME  and  even  more  of  the  INCV  MFS  model  are  decreasing  over 


time,  the  latter  ultimately  obtains  a  weight  similar  to  the  synthetic 
inertial  oscillations  model.  Finally,  we  notice  the  very  large  factor 
affecting  the  wave  model;  one  should  remember  that  the  displace¬ 
ment  itself  forecasted  by  this  model  is  much  smaller. 

The  results  for  a  24  h  forecast  are  shown  in  Fig.  13.  During  the 
hindcast,  the  unbiased,  weighted  individual  models,  the  unbiased 
linear  combination  and  the  ACEKF  combination  all  perform  rela¬ 
tively  well  (and  better  than  the  ensemble  mean).  However,  in  fore¬ 
cast  mode,  the  ensemble  mean  method  leads  to  a  smaller  error  than 
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Fig.  17.  MREA07  experiment;  average  final  (blue)  and  hourly-average  (red)  error  (km)  for  the  drifter  position  after  36  h.  for  both  the  hindcast  (upper  panel)  and  the  forecast 
(lower  panel).  (For  interpretation  of  the  references  to  color  in  this  figure  legend,  the  reader  is  referred  to  the  web  version  of  this  article.) 
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the  linear  combination!  With  PCA,  the  linear  combination  is  some¬ 
what  better;  the  Kalman  filter  with  real  weights  also  performs  rea¬ 
sonably  well.  All  this  indicates  that  the  characteristic  time  during 
which  the  obtained  combinations  are  valid,  has  approximately  been 
reached.  The  ACEFK  combination,  where  more  degrees  of  freedom 
are  present,  yields  a  much  larger  error  than  the  real-number  Kal¬ 
man;  again,  PCA  allows  to  somewhat  improve  its  performance. 
Fig.  14  illustrates  this  for  a  particular  segment  starting  20  days  after 
the  drifter  launch  (some  HE  methods  are  not  shown  for  the  clarity 
of  the  figure).  In  this  example,  the  ensemble  mean  and  ensemble 
linear  combination  are  outperformed  by  the  real  Kalman  filter; 
however,  the  weights  in  the  complex  Kalman  filter  do  lead  to  an  er¬ 
ratic  forecast.  With  PCA,  the  obtained  trajectory  is  less  erratic,  but 
still  completely  incorrect.  The  instability  of  complex  weights,  even 
with  PCA,  is  further  illustrated  in  Fig.  1 5  showing  their  time-evolu¬ 
tion;  all  components  weights  undergo  large  variations,  with  each 
component  sometimes  being  important,  sometimes  negligible. 
One  more  example  is  given  in  Fig.  1 6,  starting  26  days  after  the  drif¬ 
ter  launch;  similar  conclusions  again  apply. 

The  situations  gets  even  worse  when  trying  to  predict  the  drift 
at  36  h.  Results  are  shown  in  Fig.  17.  In  forecasting  mode,  the 
ensemble  mean  methods  now  yields  the  smallest  errors;  all  other 
methods  have  errors  of  the  same  order,  or  larger,  as  individual 
models.  This  clearly  indicates  that  the  obtained  combinations  are 
not  valid  anymore  after  (less  than)  36  h;  results  may  be  somewhat 
better  or  somewhat  worse,  depending  purely  on  luck.  For  some 
other  tracks  (not  shown),  the  results  are  somewhat  better,  and 
some  HE  methods  still  perform  relatively  well,  leading  to  results 
similar  or  slightly  better  than  the  ensemble  mean.  However,  one 
might  conclude  that,  using  the  mentioned  models,  the  surface  drift 
predictability  limit  in  the  Ligurian  Sea  during  the  MREA07  experi¬ 
ment  was  somewhere  between  24  and  36  h. 


5.  Conclusion 

In  the  present  study,  we  examined  how  hyper-ensemble  (HE) 
methods  can  improve  the  forecast  of  surface  drift  over  forecasts 
obtained  with  a  single  model,  or  with  the  mean  of  different  models. 
We  used  linear  combinations  of  atmospheric  and  ocean  models,  as 
well  as  a  wave  model  and  synthetic  models  (circular  or  constant, 
corresponding  to  inertial  oscillations  or  to  bias).  We  first  examined 
the  most  common  “combinations”,  such  as  the  ensemble  mean  or 
the  “best  past  model”.  Another  method  is  to  determine  the  value  of 
the  weights  during  a  training  period,  by  least-squares  minimiza¬ 
tion  of  the  distance  to  observed  surface  drift.  We  also  implemented 
the  Kalman  filter,  a  data  assimilation  method  allowing  to  dynami¬ 
cally  change  the  value  of  weights  when  new  observed  drifts  be¬ 
come  available.  The  latter  method  also  allows  to  estimate  a 
characteristic  time  during  which  the  model’s  skills  are  approxi¬ 
mately  constant,  and  hence  help  us  to  decide  whether  or  not  a 
HE  method  should  be  used  or  not. 

Surface  drift  can  be  represented  by  complex  numbers;  further¬ 
more,  if  one  also  uses  complex  weights  in  the  linear  combination, 
this  allows  to  stretch  and  to  rotate  the  predicted  drift  vectors.  The 
Kalman  filter  has  to  be  adapted  for  using  complex  numbers,  lead¬ 
ing  to  the  so-called  ACEKF  filter;  covariances  between  real  and 
imaginary  parts  are  automatically  generated. 

Whenever  the  forecast  period  was  short  enough,  the  HE  lead  to 
strongly  improved  results,  with  the  final  position  error  reduced  by 
at  least  a  factor  3  compared  to  individual  models.  It  was  also 
shown  that  dynamical  methods,  such  as  the  ACEKF,  yield  the 
smallest  forecast  error;  as  mentioned  before,  the  time-evolution 
of  the  weights  also  provides  insight  into  the  HE  and  models  perfor¬ 
mance.  When  many  models  are  available  (seven  in  our  MREA07 
experiment),  it  is  useful  to  reduce  the  amount  of  weights  to  deter¬ 


mine.  e.g.  by  applying  a  principal  component  analysis  and  remov¬ 
ing  colinearities  between  models. 

We  showed  the  benefit  of  adding  one  or  more  synthetic  models 
(a  constant  model  adds  unbiasing  to  the  ensemble;  a  circular  mod¬ 
el  can  add  or  correct  inertial  oscillations).  However,  more  models 
imply  more  degrees  of  freedom  to  determine  during  the  training 
period,  and  this  may  render  the  ensemble  unstable. 

In  general,  forecasting  the  drift  up  to  12  h  is  always  possible  (in 
both  domains),  and  HE  methods  significantly  improve  results  over 
individual  models.  In  particular,  adding  a  synthetic  bias  model  to  a 
single  weighted  model  strongly  decreases  errors,  indicating  that  at 
this  forecasting  timescale,  persistence  is  very  useful.  Adding  more 
models  and  combining  them  with  dynamical  methods  such  as  the 
Kalman  filter  allows  to  further  improve  results.  However,  after  24. 
and  especially  36  h,  forecasting  might  become  more  problematic, 
at  least  in  a  complex  environments  with  strong  meso-  and  smaller 
scale  eddy  activity,  such  as  the  Ligurian  Sea.  Indeed,  we  showed  that 
model  skills  may  change  significantly  over  such  a  time  period,  and 
hence  the  weighted  combination  of  models  obtained  during  the 
training  period  is  not  optimal  during  the  whole  forecasting  period. 
In  particular,  adding  a  bias  model  to  a  single  model  does  not  increase 
skill,  indicating  that  persistence  is  not  useful  anymore,  and  that  the 
role  of  primitive  equation  models  become  truly  crucial  for  scales 
longer  than  12  h:  the  issue  of  good  model  performance  cannot  be 
avoided  by  super-ensembles  1  In  the  Ligurian  Sea,  HE  methods  per¬ 
formed  poorly  for  24  or  26  h  forecasts,  and  a  simple  ensemble  mean 
or  an  unbiased  linear  combination  lead  to  better  results  than  a  Kal¬ 
man  filter  method.  Hence,  it  might  be  better  to  use  “average” 
weights  obtained  during  a  longer  training  period  rather  than  adapt¬ 
ing  to  the  most  recent  data.  However,  the  Kalman  filter  methods  at 
least  allow  to  know  how  fast  weights  change  in  time,  so  that  one 
can  decide  which  HE  technique  to  use.  Thus,  when  observations 
and  different  models  are  available,  we  recommend  to  implement  a 
dynamical  HE  method  such  as  the  Kalman  filter,  and  to  examine 
the  time-evolution  of  the  weights.  If  they  are  stable  during  a  lapse 
of  time  corresponding  to  the  desired  forecast  horizon,  then  the  re¬ 
sults  of  the  dynamical  method  should  be  used.  If  they  are  stable  dur¬ 
ing  a  much  longer  period,  methods  with  a  priori  fixed  training 
lengths  will  yield  approximately  the  same  results.  To  the  contrary, 
if  the  weights  are  varying  very  quickly  (compared  to  the  desired  lead 
time  of  the  forecast),  one  should  use  average  weights. 

In  the  case  of  a  2-dimensional  variable  such  as  surface  velocity 
(or  drift),  the  question  whether  one  should  use  real  or  complex 
weights  depends  on  the  size  of  the  ensemble  (i.e.  the  degrees  of 
freedom  to  be  fixed).  Generally,  complex  weights  provide  better  re¬ 
sults  as  model-predicted  drifts  can  be  both  multiplied  and  rotated. 
However,  twice  as  many  parameters  are  to  be  fixed  during  the 
training  period.  If  many  different  models  are  present,  or  insuffi¬ 
cient  training  data  is  available,  one  could  then  obtain  better  results 
with  real  weights.  PCA  generally  helps  to  decrease  the  amount  of 
degrees  of  freedom,  and  might  thus  allow  using  complex  weights 
where  otherwise,  real  weights  would  have  led  to  the  best  results. 

In  order  to  use  hyper-ensembles  operationally,  one  needs  to 
centralize  all  the  forecasts,  as  well  as  the  observations.  The  HE 
computations  themselves  are  performed  very  fast;  a  large  part  of 
the  effort  goes  to  correctly  reading  and  interpolating  the  forecasts 
from  the  individual  models  into  the  HE  algorithm.  Provided  that 
these  issues  are  resolved,  an  operational  HE  forecast  can  be  pro¬ 
vided,  as  has  already  been  demonstrated  (Rixen  et  al.,  2008).  As 
more  models  are  implemented  in  various  regions,  we  hope  the 
HE  techniques  will  improve  forecasts  at  reduced  cost. 
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